Measuring Performance when Positives are Rare: Relative Advantage versus Predictive
Authors
Abstract
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate, comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The RA results show that searching for NPPs by using our best NPP predictor as a filter is more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
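The last point of the abstract can be illustrated with a small sketch. The abstract does not give the definition of Relative Advantage, so the code below does not implement RA; it uses a simple stand-in, `enrichment` (precision divided by the base rate of positives), which is an assumption made here only to show how a filter-style measure can separate models that accuracy scores almost identically. The confusion-matrix counts and the model names `weak` and `strong` are hypothetical and are not results from the paper.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def enrichment(tp, fp, fn, tn):
    """Precision divided by the base rate of positives.

    Stand-in measure for illustration only; NOT the paper's Relative Advantage.
    """
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0
    precision = tp / predicted_positive
    base_rate = (tp + fn) / (tp + fp + fn + tn)
    return precision / base_rate


# Hypothetical counts: 10 positives among 10,000 examples.
models = {
    "weak": dict(tp=1, fp=10, fn=9, tn=9980),    # covers 1 of 10 positives
    "strong": dict(tp=8, fp=10, fn=2, tn=9980),  # covers 8 of 10 positives
}

for name, counts in models.items():
    print(f"{name}: accuracy={accuracy(**counts):.4f}, "
          f"enrichment={enrichment(**counts):.1f}")
```

With these made-up counts both models score above 99.8% accuracy, because both exclude nearly all of the abundant negatives, while the stand-in measure differs by more than a factor of four between them.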
Similar resources
Measuring Performance when Positives Are Rare: Relative Advantage versus Predictive Accuracy - A Biological Case Study
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of ...
Measuring Performance when Positives are Rare
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of p...
Learning Chomsky-like Grammars for Biological Sequence Families
This paper presents a new method of measuring performance when positives are rare and investigates whether Chomsky-like grammar representations are useful for learning accurate comprehensible predictors of members of biological sequence families. The positive-only learning framework of the Inductive Logic Programming (ILP) system CProgol is used to generate a grammar for recognising a class of ...
Model Predictive Inferential Control of a Distillation Column
Typical production objectives in distillation processes require the delivery of products whose compositions meet certain specifications. The distillation control system, therefore, must hold product compositions as near the set points as possible in the face of upsets. In this project, inferential model predictive control, that utilizes an artificial neural network estimator and model predictive cont...
Presenting a Hybrid Approach based on Two-stage Data Envelopment Analysis to Evaluating Organization Productivity
Measuring the performance of a production system has been an important task in management for purposes of control, planning, etc. Lord Kelvin said: “When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” Hence, manag...